
Conversation

@bbernays (Contributor) commented Dec 9, 2025

Summary


Use the following steps to ensure your PR is ready to be reviewed

  • Read the contribution guidelines πŸ§‘β€πŸŽ“
  • Run go fmt to format your code πŸ–Š
  • Lint your changes via golangci-lint run 🚨 (install golangci-lint here)
  • Update or add tests πŸ§ͺ
  • Ensure the status checks below are successful βœ…

@bbernays bbernays requested a review from a team as a code owner December 9, 2025 17:34
@bbernays bbernays requested a review from hermanschaaf December 9, 2025 17:34
@hermanschaaf (Member)
This is an interesting idea!

But I want to note that even setting concurrency = 1 does not guarantee you won't hit rate limits, e.g. for endpoints where the rate limit is 1 request per second (we may still make 3-4 requests in that second, just not concurrently).

I think, rather than tuning concurrency, which is a bit of a blunt instrument, one way or another we need a rate.Limiter, with an interface that lets you define how many requests to allow for the table/client per time period. Ideally we would aim to stay at around 50-80% of the rate limits imposed by providers, so as not to cause disruption for our users (or for other users of the APIs within their company).

The other thing I note here is that this is currently only implemented for the DFS scheduler, which is (IIRC) not used most of the time nowadays.

@erezrokah (Member) commented Dec 10, 2025

> this is currently only implemented for the DFS scheduler, which is (IIRC) not used most of the time nowadays.

All other schedulers except the queue one wrap the DFS scheduler (they re-order the table/client pairs before calling the DFS logic).

Agree that in some cases concurrency won't be good enough. I think the best approach would actually be to handle rate limiting in each service SDK, since there we can read the rate-limiting headers, see how much quota we have left, and avoid hitting the limit.

If the service doesn't expose the rate limiting data, we should use a rate limiter for requests per second.

We know that even relying on statically documented rate limits is not so reliable, as the limits are not consistent between accounts and the docs sometimes don't reflect reality.

@bbernays bbernays closed this Dec 16, 2025
